Multi armed bandit problem: some insights
Authors: not recorded
Abstract
Multi-Armed Bandit problems have been widely studied in the context of sequential analysis. Application areas include clinical trials, adaptive filtering, and online advertising. The problem is often framed as selecting a policy that maximizes a gambler's reward when playing multiple slot machines, each generating rewards from an unknown distribution. It is under this framework that we describe the model and develop the subsequent results. Lai and Robbins [8] studied this problem in a statistical setting and developed asymptotically efficient adaptive allocation rules.
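As a concrete illustration of the policy-selection framework described in the abstract, the sketch below plays the well-known UCB1 index policy against simulated Bernoulli slot machines. This is only an illustrative sketch: the function name, the Bernoulli reward model, and the specific arm means are assumptions for demonstration, not details taken from the paper, and UCB1 is one standard example of an asymptotically efficient allocation rule rather than the specific rule of Lai and Robbins.

```python
import math
import random

def ucb1(arm_means, horizon, seed=0):
    """Play the UCB1 policy against Bernoulli arms with the given means.

    Returns (total reward collected, number of pulls per arm).
    """
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k      # times each arm has been pulled
    sums = [0.0] * k      # cumulative reward per arm
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1   # pull each arm once to initialize
        else:
            # UCB1 index: empirical mean plus an exploration bonus
            # that shrinks as an arm is sampled more often.
            arm = max(range(k),
                      key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2.0 * math.log(t) / counts[i]))
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total += reward
    return total, counts
```

Over a long horizon the exploration bonus decays, so the policy concentrates its pulls on the arm with the highest empirical mean while still sampling the others occasionally, which is what keeps the regret growing only logarithmically.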
Similar resources
The Irrevocable Multiarmed Bandit Problem
This paper considers the multi-armed bandit problem with multiple simultaneous arm pulls and the additional restriction that we do not allow recourse to arms that were pulled at some point in the past but then discarded. This additional restriction is highly desirable from an operational perspective and we refer to this problem as the ‘Irrevocable Multi-Armed Bandit’ problem. We observe that na...
The Blinded Bandit: Learning with Adaptive Feedback
We study an online learning setting where the player is temporarily deprived of feedback each time it switches to a different action. Such a model of adaptive feedback naturally occurs in scenarios where the environment reacts to the player's actions and requires some time to recover and stabilize after the algorithm switches actions. This motivates a variant of the multi-armed bandit problem, wh...
Volatile Multi-Armed Bandits for Guaranteed Targeted Social Crawling
We introduce a new variant of the multi-armed bandit problem, called Volatile Multi-Arm Bandit (VMAB). A general policy for VMAB is given with proven regret bounds. The problem of collecting intelligence on profiles in social networks is then modeled as a VMAB and experimental results show the superiority of our proposed policy.
Online Multi-Armed Bandit
We introduce a novel variant of the multi-armed bandit problem, in which bandits are streamed one at a time to the player, and at each point, the player can either choose to pull the current bandit or move on to the next bandit. Once a player has moved on from a bandit, they may never visit it again, which is a crucial difference between our problem and classic multi-armed bandit problems. In t...
Cognitive Capacity and Choice under Uncertainty: Human Experiments of Two-armed Bandit Problems
The two-armed bandit problem, or more generally the multi-armed bandit problem, has been identified as the underlying problem in many practical circumstances that involve making a series of choices among uncertain alternatives. Problems like job searching, customer switching, and even the adoption of fundamental or technical trading strategies by traders in financial markets can be formulate...
Journal title: (not recorded)
Volume / Issue: (not recorded)
Pages: -
Publication date: 2011